- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3.8k
chore(engine): Refactor streams result builder to use column-oriented processsing #19505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0720a8a    to
    c9658ff      
    Compare
  
    c9658ff    to
    3e4384d      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM.
Left a suggestion how to reduce allocations. Also, feel free to add the benchmark code and run the benchmarks for old and new version and post the output of benchstat old.txt new.txt
251bf2b    to
    3d6ac88      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmarks look great!
ab921b8    to
    761be1c      
    Compare
  
    761be1c    to
    a8d54c8      
    Compare
  
    | @chaudum I updated the implementation to use a slice of structs for rows buffer. Benchmark shows there is no major difference: here's benchstat between the previously reviewed implementation and the latest one (diff): I also tried to use  func TestLabelsBuilder_Playground(t *testing.T) {
	lbs := labels.EmptyLabels()
	b := NewBaseLabelsBuilder().ForLabels(lbs, labels.StableHash(lbs))
        // b.Set(ParsedLabel, "a", "b") // will be returned as part of result.Labels()
	b.Set(StreamLabel, "a", "b") // won't be returned as result.Labels()
	result := b.LabelsResult()
	require.Equal(t, "{a=\"b\"}", result.Labels().String())
}I'm trying to add this support but it takes time and increases the size of the PR so I'd like to have a different PR for it, wdyt? | 
        
          
                pkg/engine/compat.go
              
                Outdated
          
        
      | "github.com/prometheus/prometheus/promql" | ||
|  | ||
| "github.com/grafana/loki/pkg/push" | ||
| "github.com/grafana/loki/v3/pkg/logproto" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the linter will complain about this line not being in the third import block.
… processing Change streamsResultBuilder.CollectRecord() from row-by-row to column-by-column iteration. This better aligns with Arrow's columnar memory layout, improving CPU cache locality by reading contiguous memory regions.
a8d54c8    to
    3bfcf9c      
    Compare
  
    … processsing (grafana#19505) Change streamsResultBuilder.CollectRecord() from row-by-row to column-by-column iteration. This better aligns with Arrow's columnar memory layout, improving CPU cache locality by reading contiguous memory regions. I also remove IsValid checks given that IsNull checks are present. IsValid seems to be an exact opposite of IsNull (based on this code). The slices and label builders for intermediate rows data are reused between CollectRecord.
What this PR does / why we need it:
Change streamsResultBuilder.CollectRecord() from row-by-row to column-by-column iteration. This better aligns with Arrow's columnar memory layout, improving CPU cache locality by reading contiguous memory regions.
I also remove
IsValidchecks given thatIsNullchecks are present.IsValidseems to be an exact opposite ofIsNull(based on this code).The slices and label builders for intermediate rows data are reused between CollectRecord.
old vs latest version
Special notes for your reviewer:
Checklist
CONTRIBUTING.mdguide (required)featPRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.mddeprecated-config.yamlanddeleted-config.yamlfiles respectively in thetools/deprecated-config-checkerdirectory. Example PR